Automatic alignment of medical vs. general terminologies

Authors

  • Laura Diosan
  • Alexandrina Rogozan
  • Jean-Pierre Pécuchet
Abstract

We propose an original method for automatically aligning definitions taken from different dictionaries that may refer to the same concept even though they carry different labels. Aligning a specialized terminology, used by librarians to index concepts, with a general vocabulary, used by a neophyte user to retrieve documents on the Internet, is expected to improve the performance of the information retrieval process. The selected domain is medicine. The terminology alignment is performed by an SVM classifier trained on a compact but relevant representation of each definition pair, built from several similarity measures and the lengths of the definitions. Three syntactic levels are investigated: Nouns, Nouns-Adjectives, and Nouns-Adjectives-Verbs. Our aim is to show that combining similarity measures gives better semantic access to the document content than any single measure and improves the performance of the automatic alignment. The results obtained on the test set show the relevance of our approach, with the F-measure reaching 88%. However, this conclusion should be validated on
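As a rough illustration of the "compact representation" of a definition pair, the sketch below builds a small feature vector from a few similarity measures plus the two definition lengths. The abstract does not name the measures the authors used, so Jaccard, Dice, and cosine are assumptions chosen for illustration. The function names `tokenize` and `similarity_features` are hypothetical. The part-of-speech filtering (Nouns, Nouns-Adjectives, Nouns-Adjectives-Verbs) is omitted here, and all word tokens are kept.

```python
import math
import re
from collections import Counter


def tokenize(definition):
    """Lowercase a definition and split it into word tokens."""
    return re.findall(r"\w+", definition.lower())


def similarity_features(def_a, def_b):
    """Return [jaccard, dice, cosine, len_a, len_b] for a definition pair.

    The choice of measures is an assumption; the source only says
    "several similarity measures and the length of definitions".
    """
    tokens_a, tokens_b = tokenize(def_a), tokenize(def_b)
    if not tokens_a or not tokens_b:
        # No overlap can be measured when one definition is empty.
        return [0.0, 0.0, 0.0, len(tokens_a), len(tokens_b)]

    set_a, set_b = set(tokens_a), set(tokens_b)
    shared = set_a & set_b

    # Set-based measures over distinct tokens.
    jaccard = len(shared) / len(set_a | set_b)
    dice = 2 * len(shared) / (len(set_a) + len(set_b))

    # Cosine similarity over raw term-frequency vectors.
    freq_a, freq_b = Counter(tokens_a), Counter(tokens_b)
    dot = sum(freq_a[t] * freq_b[t] for t in shared)
    norm = math.sqrt(
        sum(c * c for c in freq_a.values()) * sum(c * c for c in freq_b.values())
    )
    cosine = dot / norm

    return [jaccard, dice, cosine, len(tokens_a), len(tokens_b)]
```

Vectors of this kind, labeled as aligned or not aligned, could then serve as training examples for an SVM classifier; the classifier itself is left out of this sketch.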


Similar resources

Automatic Alignment of Medical Terminologies with General Dictionaries for an Efficient Information Retrieval

Abstract: The automatic alignment between a specialized terminology, used by librarians to index concepts, and a general vocabulary, employed by a neophyte user to retrieve medical information, will certainly improve the performance of the search process; this is one of the purposes of the ANR VODEL project. The authors propose an original automatic alignment of definitions tak...


Multi-terminology indexing for the assignment of MeSH descriptors to medical abstracts in French

BACKGROUND: To facilitate information retrieval in the biomedical domain, a system for the automatic assignment of Medical Subject Headings to documents curated by an online quality-controlled health gateway was implemented. The French Multi-Terminology Indexer (F-MTI) implements a multiterminology approach using nine main medical terminologies in French and the mappings between them. OBJECTIV...


Evaluating alignment quality between iconic language and reference terminologies using similarity metrics

BACKGROUND: Visualization of Concepts in Medicine (VCM) is a compositional iconic language that aims to ease information retrieval in Electronic Health Records (EHR), clinical guidelines or other medical documents. Using VCM language in medical applications requires alignment with medical reference terminologies. Alignment from Medical Subject Headings (MeSH) thesaurus and International Classifi...


Using Word Alignment to Extend Multilingual Medical Terminologies

Medical terminologies such as those provided in the UMLS are never exhaustive and there is a constant need to enrich them, especially in terms of multilinguality. We present a methodology to acquire new French translations of English medical terms based on word alignment in a parallel corpus — i.e. pairing of corresponding words. We automatically collected a 27.7-million-word parallel, English-...


Automatic alignment and phonetic studies: Comparing alignment systems for the analysis of the schwa

Three automatic alignment systems are compared in their adequacy to account for the vowel schwa as compared to a manual transcription obtained from two judges. Error rates and types are analysed, as well as the linguistic factors involved. The type of surrounding consonants and the duration of schwa influence the decisions of the three systems. Moreover, the systems behave differently, dependin...




Publication date: 2008